This paper presents the theory for a rapidly converging adaptive 
linear digital filter. The filter weights are updated for every new 
input sample. This way the filter is optimal (in the minimum mean 
square error sense) for all past data up to the present, at all instants 
of time. This adaptive filter thus has the fastest possible rate of 
convergence. Such an adaptive filter, which is highly desirable for 
use in dynamical systems, e.g., digital equalizers, used to require on 
the order of $N^2$ multiplications for an N-tap filter at each instant of 
time. Recent "fast" algorithms have reduced this number to about 
$10N$. One of these algorithms has the lattice form, and is shown here to 
have some interesting properties: It decorrelates the input data to a 
new set of orthogonal components using an adaptive, Gram-Schmidt-like 
transformation. Unlike other fast algorithms of the Kalman 
form, the filter length can be changed at any time with no need to 
restart or modify previous results. It is conjectured that these 
properties will make it less sensitive to digital quantization errors in 
finite word-length implementation. 

I. INTRODUCTION 

Gradient algorithms are widely used in adaptive tapped delay-line 
filters, such as equalizers, to derive a set of tap coefficients that gives 
the desired output with a minimum mean square error (mmse). It is 
widely recognized¹ that when the input samples presented to the 
adaptive system are highly correlated, convergence to the optimum 
filter coefficients is slow. An important contribution to solving this 
problem of slow convergence was made by Godard,² who obtained an 
adaptive algorithm that minimizes the total mse at all instants of time. 
Consequently, the Godard algorithm has the fastest possible rate of 
convergence in an mmse sense, and is usually referred to as the optimal 
mean-square adaptive estimator. This algorithm has the structure of 
a Kalman filter, and its complexity is on the order of $N^2$ multiplications 
and additions per iteration, where N is the number of filter coefficients 
being adjusted. Fast convergence results when successive corrections 
to the coefficient vector are adaptively decorrelated. Based on this 
observation, other practical, less complex schemes of orthogonalizing 
the corrections were proposed; see, for example, Ref. 1. Recently, an 
efficient (or "fast") computing procedure, called the fast Kalman 
algorithm, was obtained which provides a fast-converging estimator 
identical to that of Godard, but which requires only on the order of 
$10N$ multiplications.³,⁴,⁶ 

Another approach to accelerated convergence is to transform the 
input data to obtain uncorrelated inputs to the estimator.⁶ When the 
characteristics of the channel are fixed and known, the transformation 
can be found from the data autocorrelation matrix. When this matrix 
is unknown, the transformation has to be adaptive. The lattice 
structure, whose computational complexity grows only like N, is known 
to generate "white" uncorrelated outputs by a process called inverse 
filtering that keeps removing correlated components from the input 
signal,⁷⁻¹¹ and it has therefore been proposed for this application. However, the 
outputs of the lattice structure are uncorrelated only after it has 
converged to its steady state; therefore, it may not converge as fast as 
the Godard algorithm. Recently, Morf was able to formulate the lattice 
algorithm in a special form such that its outputs are uncorrelated in 
the mean square sense for all instants of time.¹²,¹³ Our purpose is to 
extend Morf's work to compute an adaptive estimator which is 
equivalent in performance to Godard's. Moreover, we will demonstrate 
that the computational complexity of the adaptive lattice algorithm 
compares well with the fast Kalman algorithm of Falconer and Ljung.⁵ 
The advantage of the lattice structure is the ease of changing the 
number of coefficients. It is also conjectured that the lattice algorithm 
will be less sensitive than the Kalman algorithm to finite-precision 
digital implementation. Recently, this was observed in Ref. 14. It is 
also discussed in Ref. 15, where the development of an equalizer based 
on the adaptive lattice algorithm is presented in a form similar to the 
one given here. One case to illustrate this property of the lattice will 
be given at the end of this paper. 

In the next section, the optimal least mean square estimator and 
predictor are precisely defined, and the minimal error that results is 
given. In Section III, several properties of the optimal predictor are 
explored and are related to the estimation problem. In Section IV, an 
efficient (in the sense of a small number of computations) lattice form is 
derived, using the relations developed in Section III, that maintains 
the optimal convergence. In Section V, the properties of this lattice 
form are compared to the steady-state lattice structure. Suggestions 
for further work are also included. 

II. OPTIMAL MEAN SQUARE ESTIMATION 

2.1 Notation and definitions 

Given a discrete-time input data sequence $\{y_i\}$, $i = 0, 1, \ldots$, it is 
desired to find the set of weights for a transversal tapped delay-line 
filter such that the output of this filter is a good estimate of another 
sequence $\{d_i\}$. An adaptive equalizer, for example, has the received 
signal as its input, while its output should provide an estimate of the 
transmitted data. In a transversal filter, a vector of filter coefficients, 
of length $p + 1$, operates on vectors of data that are shifted versions of 
the input data for time $0 \le t \le T$, defined by 

$$y^t_{p,T} = (y_T, y_{T-1}, \ldots, y_{T-p}), \tag{1}$$

with $y^t$ being the transpose of $y$, and it is assumed that $y_i = 0$ for $i < 0$. 
As we are concerned with an adaptive, i.e., time-varying, filter, its 
weight vector of order $p + 1$ will be denoted 

$$w^t_{p,T} = [w_{p,T}(0), w_{p,T}(1), \ldots, w_{p,T}(p)]. \tag{2}$$

Using these definitions, the output of the $(p + 1)$-long linear estimator 
at time $T$ is $\hat{d}_{p,T}$, given by 

$$\hat{d}_{p,T} = w^t_{p,T}\, y_{p,T}. \tag{3}$$

Now suppose that $w_{p,T}$ is the best predictor for time $T$; then the total, 
or accumulated, mean square estimation error up to time $T$, when 
using this predictor, is given by 

$$E(w_{p,T}) = \sum_{i=0}^{T} \left(d_i - w^t_{p,T}\, y_{p,i}\right)^2. \tag{4}$$

The sequence of weight vectors that minimizes eq. (4) at every instant 
of time $T$ is the most rapidly converging sequence, and is called the 
optimal adaptive filter. 

Making use of the following time-domain definitions of the 
cross-correlation vector and the autocorrelation matrix, 

$$\sum_{i=0}^{T} d_i\, y_{p,i} = g_{p,T} \tag{5}$$

$$\sum_{i=0}^{T} y_{p,i}\, y^t_{p,i} = R_{p,T}, \tag{6}$$

one obtains 

$$E(w_{p,T}) = \sum_{i=0}^{T} d_i^2 - 2\, w^t_{p,T}\, g_{p,T} + w^t_{p,T}\, R_{p,T}\, w_{p,T}. \tag{7}$$

2.2 The optimal estimator 

Equating the gradient of eq. (7) with respect to $w_{p,T}$ to zero gives 

$$R_{p,T}\, w_{p,T} = g_{p,T}. \tag{8}$$

It is seen that solving for the optimal estimator is equivalent to 
inverting a matrix at every new sample point: 

$$w_{p,T} = R^{-1}_{p,T}\, g_{p,T}. \tag{9}$$

Godard showed that $w_{p,T}$ can be updated with on the order of $p^2$ 
calculations,¹,² which is an improvement over the simple matrix 
inversion, requiring on the order of $p^3$ calculations. Algorithms that require 
only on the order of $10p$ operations for obtaining the optimal estimator 
appeared³,⁴ subsequently, and are called "fast" algorithms. 
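
As a point of reference for eqs. (5), (6), and (8), the following is a minimal sketch of the direct (non-recursive) solution, written in plain Python. The helper names `solve`, `data_vector`, and `optimal_estimator`, as well as the short test sequence, are illustrative assumptions and do not come from the paper. The sketch rebuilds $R_{p,T}$ and $g_{p,T}$ from scratch and solves the normal equations by Gaussian elimination, which is the order-$p^3$ baseline that the recursive algorithms avoid.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def data_vector(y, p, t):
    """y_{p,t} = (y_t, y_{t-1}, ..., y_{t-p}), with y_i = 0 for i < 0 (eq. 1)."""
    return [y[t - i] if t - i >= 0 else 0.0 for i in range(p + 1)]


def optimal_estimator(y, d, p, T):
    """Accumulate g_{p,T} (eq. 5) and R_{p,T} (eq. 6), then solve eq. (8)."""
    R = [[0.0] * (p + 1) for _ in range(p + 1)]
    g = [0.0] * (p + 1)
    for t in range(T + 1):
        v = data_vector(y, p, t)
        for i in range(p + 1):
            g[i] += d[t] * v[i]
            for j in range(p + 1):
                R[i][j] += v[i] * v[j]
    return solve(R, g)


# Hypothetical data: d is produced from y by a known 3-tap filter,
# so the least-squares weights should reproduce that filter.
y = [1.0, 0.5, -0.3, 0.8, -0.6, 0.2, 0.9, -0.4, 0.1, 0.7]
true_w = [0.5, -0.25, 0.125]
d = [sum(wi * vi for wi, vi in zip(true_w, data_vector(y, 2, t)))
     for t in range(len(y))]
w = optimal_estimator(y, d, p=2, T=len(y) - 1)
```

Because the desired sequence here is an exact linear function of the data vectors, the total error of eq. (4) can be driven to zero and `w` coincides with the generating filter up to rounding.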

The essence of this paper is to derive the fast algorithm in a special 
form, called the lattice form. This form was proposed to speed the 
convergence of the weight vector of an adaptive predictor to its optimal 
value (see Refs. 7 to 11). As will be seen in the next paragraph, the 
estimator problem is closely related to the prediction problem. 

From eqs. (7) and (8) it is seen that the minimal total mse that 
results using the optimal estimator is simply 

$$E_{\mathrm{opt}}(w_{p,T}) = \sum_{i=0}^{T} d_i^2 - g^t_{p,T}\, R^{-1}_{p,T}\, g_{p,T}. \tag{10}$$

This is in contrast to the adaptive gradient algorithm, whose 
performance is more difficult to analyze. 

2.3 Optimal prediction 

The problem of prediction is more basic, but similar to the problem 
of estimation. Solving this problem will be shown to simplify the 
solution of the estimation problem. For linear prediction, a set of 
weights is used to linearly estimate the present input point from past 
values of the input data. Let the set of $p$ weights at time $T$ be 
$\{-a_{p,T}(1), -a_{p,T}(2), \ldots, -a_{p,T}(p)\}$, so that the error when predicting 
the input point $y_i$ is given by 

$$e_{p,i} = A^t_{p,T}\, y_{p,i}, \tag{11}$$

with 

$$A^t_{p,T} = [1, a_{p,T}(1), \ldots, a_{p,T}(p)]. \tag{12}$$

The error generated in predicting the input is that part of the input 
which is uncorrelated to past values of the input. This is a desired 
feature for fast convergence of adaptive filters. 

The total square error up to time $T$ is, thus, given by 

$$\sum_{i=0}^{T} e^2_{p,i} = A^t_{p,T}\, R_{p,T}\, A_{p,T}. \tag{13}$$

Taking derivatives with respect to $a_{p,T}(1)$ to $a_{p,T}(p)$, it is found that 
the predictor weight vector that will minimize the total mse up to time 
$T$ is the solution of the last $p$ equations of the expression 

$$R_{p,T}\, A_{p,T} = \begin{pmatrix} R^e_{p,T} \\ 0_p \end{pmatrix}, \tag{14}$$

with $R^e_{p,T}$ yet unknown and $0_p = (0, \ldots, 0)^t$ a vector of order $p$. Using 
this optimal predictor in eq. (13) gives 

$$\sum_{i=0}^{T} e^2_{p,i} = A^t_{p,T}\, R_{p,T}\, A_{p,T} = A^t_{p,T} \begin{pmatrix} R^e_{p,T} \\ 0_p \end{pmatrix} = R^e_{p,T}. \tag{15}$$

Therefore, $R^e_{p,T}$ is the minimal total mse that will result. 

As before, obtaining the optimal predictor $A_{p,T}$ for all $T$ is equivalent 
to inverting the matrix $R_{p,T}$ for all $T$. An efficient algorithm for doing 
this will be described. It should be noted, from comparing eqs. (8) and 
(14), that the latter is a simpler "homogeneous" set of equations, except 
for the end term $R^e_{p,T}$; therefore, its solutions can serve as a basis for 
the solution of eq. (8). 

III. DERIVATION OF THE ORDER AND TIME UPDATE RELATIONS 

3.1 Time shift properties of $R_{p,T}$ 

The vectors $y_{p,T}$ for successive values of $T$ are shifts of each other. 
As these vectors build up the matrix $R_{p,T}$ in eq. (6), it is expected that 
shifted versions of the solutions to the predictor and estimator 
equations will serve in updating these solutions. For doing this, the shift 
properties of $R_{p,T}$ are explored. For the $(i, j)$ term in eq. (6), we have 

$$R_{p,T}(i, j) = \sum_{k=0}^{T} y_{k+1-i}\, y_{k+1-j} = R_{p-1,T}(i, j) \quad \text{for all } p - 1 \ge i, j \ge 1 \tag{16}$$

$$R_{p,T}(i+1, j+1) = \sum_{k=0}^{T} y_{k-i}\, y_{k-j} = \sum_{k=-1}^{T-1} y_{k+1-i}\, y_{k+1-j} = \sum_{k=0}^{T-1} y_{k+1-i}\, y_{k+1-j} = R_{p-1,T-1}(i, j) \quad \text{for all } p - 1 \ge i, j \ge 1, \tag{17}$$

where the fact that $y_i = 0$ for $i < 0$ was used. Using Morf's notation, 
these relations can conveniently be written as 

$$R_{p,T} = \begin{pmatrix} R_{p-1,T} & X \\ X & X \end{pmatrix} = \begin{pmatrix} X & X \\ X & R_{p-1,T-1} \end{pmatrix}, \tag{18}$$

with $X$ being any other term in the matrix. It is clear from eq. (18) that 
$R_{p,T}$ is symmetric, but not Toeplitz, if steady state is not reached. 
Therefore, the properties of Toeplitz matrices cannot be used, as is 
done for example in claiming fast convergence in Ref. 7. 

3.2 Order update relations 

It will be useful to define, similar to eq. (12), a backward prediction 
vector 

$$B^t_{p,T} = (b_{p,T}(p), b_{p,T}(p-1), \ldots, b_{p,T}(1), 1), \tag{19}$$

with the backward error given by 

$$r_{p,T} = B^t_{p,T}\, y_{p,T}, \tag{20}$$

i.e., it is the error in predicting $y_{T-p}$ from $y_T$ to $y_{T-p+1}$. To minimize the 
total mean square backward prediction error up to time $T$, $B_{p,T}$ should 
be the solution of 

$$R_{p,T}\, B_{p,T} = \begin{pmatrix} 0_p \\ R^r_{p,T} \end{pmatrix}. \tag{21}$$

It is seen that this is another set of homogeneous equations, except 
for the lower one. Again, the optimal error $r_{p,T}$ will be orthogonal to 
$y_{T-p+1}, \ldots, y_T$. 

A recursive procedure will be derived in the Appendix for generating 
solutions to eqs. (14) and (21) for increasing order $p$. 

It is shown to be 

$$A_{p+1,T} = \begin{pmatrix} A_{p,T} \\ 0 \end{pmatrix} - k_{p,T}\, R^{-r}_{p,T-1} \begin{pmatrix} 0 \\ B_{p,T-1} \end{pmatrix} \tag{22}$$

for $k_{p,T}$ as defined in the Appendix, where $R^{-r}_{p,T-1} = (R^r_{p,T-1})^{-1}$ and, likewise, $R^{-e}_{p,T} = (R^e_{p,T})^{-1}$. The higher order total error is 

$$R^e_{p+1,T} = R^e_{p,T} - k^2_{p,T}\, R^{-r}_{p,T-1} = R^e_{p,T}\left(1 - k^2_{p,T}\, R^{-e}_{p,T}\, R^{-r}_{p,T-1}\right). \tag{23}$$

As increasing the predictor order would not increase, and typically will 
decrease, the error, it should be that 

$$1 \ge k^2_{p,T}\, R^{-e}_{p,T}\, R^{-r}_{p,T-1} \ge 0. \tag{24}$$

Similarly, 

$$B_{p+1,T} = \begin{pmatrix} 0 \\ B_{p,T-1} \end{pmatrix} - k_{p,T}\, R^{-e}_{p,T} \begin{pmatrix} A_{p,T} \\ 0 \end{pmatrix} \tag{25}$$

and 

$$R^r_{p+1,T} = R^r_{p,T-1} - k^2_{p,T}\, R^{-e}_{p,T}. \tag{26}$$

Similar relations hold for the prediction errors, when multiplying eqs. 
(22) and (25) by $y_{p+1,T}$: 

$$e_{p+1,T} = e_{p,T} - k_{p,T}\, R^{-r}_{p,T-1}\, r_{p,T-1} \tag{27}$$

$$r_{p+1,T} = r_{p,T-1} - k_{p,T}\, R^{-e}_{p,T}\, e_{p,T}. \tag{28}$$
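
The order update of the forward predictor, eq. (22), can be checked numerically against a direct solution of eq. (14). The sketch below is only an illustration under stated assumptions: the helper names (`corr_matrix`, `forward_predictor`, `backward_predictor`, `matvec`) and the data sequence are hypothetical, indices are 0-based, and $k_{p,T}$ is read off as the last element of $R_{p+1,T}(A_{p,T}; 0)$, which is how the Appendix defines it.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def corr_matrix(y, p, T):
    """R_{p,T} of eq. (6), with y_i = 0 for i < 0."""
    R = [[0.0] * (p + 1) for _ in range(p + 1)]
    for t in range(T + 1):
        v = [y[t - i] if t - i >= 0 else 0.0 for i in range(p + 1)]
        for i in range(p + 1):
            for j in range(p + 1):
                R[i][j] += v[i] * v[j]
    return R


def forward_predictor(R):
    """A = (1, a(1), ..., a(p)) from the last p equations of eq. (14)."""
    p = len(R) - 1
    if p == 0:
        return [1.0]
    a = solve([row[1:] for row in R[1:]], [-R[i][0] for i in range(1, p + 1)])
    return [1.0] + a


def backward_predictor(R):
    """B = (b(p), ..., b(1), 1) from the first p equations of eq. (21)."""
    p = len(R) - 1
    if p == 0:
        return [1.0]
    b = solve([row[:p] for row in R[:p]], [-R[i][p] for i in range(p)])
    return b + [1.0]


def matvec(R, x):
    return [sum(r * xi for r, xi in zip(row, x)) for row in R]


y = [1.0, 0.5, -0.3, 0.8, -0.6, 0.2, 0.9, -0.4, 0.1, 0.7]  # hypothetical data
p, T = 2, len(y) - 1

A_p = forward_predictor(corr_matrix(y, p, T))             # A_{p,T}
R_prev = corr_matrix(y, p, T - 1)
B_prev = backward_predictor(R_prev)                       # B_{p,T-1}
Rr_prev = matvec(R_prev, B_prev)[-1]                      # R^r_{p,T-1}, eq. (21)
R_next = corr_matrix(y, p + 1, T)
k = matvec(R_next, A_p + [0.0])[-1]                       # k_{p,T}

# Eq. (22): A_{p+1,T} = (A_{p,T}; 0) - k R^{-r}_{p,T-1} (0; B_{p,T-1})
A_update = [a - (k / Rr_prev) * b
            for a, b in zip(A_p + [0.0], [0.0] + B_prev)]
A_direct = forward_predictor(R_next)                      # direct solve of eq. (14)
```

For this data the two vectors `A_update` and `A_direct` should agree to rounding error, which is the content of eq. (22): the higher order predictor is obtained from the lower order forward and (delayed) backward predictors with a single scalar gain.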

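The following auxiliary quantities are needed: 

$$C_{p,T} = R^{-1}_{p,T}\, y_{p,T} \tag{29}$$

$$\gamma_{p,T} = C^t_{p,T}\, y_{p,T} = y^t_{p,T}\, R^{-1}_{p,T}\, y_{p,T}. \tag{30}$$

The order update of these quantities will also be derived in the 
Appendix. It is shown to be 

$$C_{p+1,T} = \begin{pmatrix} C_{p,T} \\ 0 \end{pmatrix} + r_{p+1,T}\, R^{-r}_{p+1,T}\, B_{p+1,T} = \begin{pmatrix} C_{p,T} \\ 0 \end{pmatrix} + \mu_{p+1,T}\, B_{p+1,T} = \begin{pmatrix} X \\ \mu_{p+1,T} \end{pmatrix}, \tag{31}$$

with $\mu_{p+1,T}$ defined by 

$$\mu_{p+1,T} = r_{p+1,T}\, R^{-r}_{p+1,T} \tag{32}$$

and, as seen, is the last term of $C_{p+1,T}$. 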

3.3 Time update relations 

To obtain the time update of $A_{p,T}$, use is made of the following: 

$$R_{p,T+1} = R_{p,T} + y_{p,T+1}\, y^t_{p,T+1}. \tag{33}$$

This relation is shown to give 

$$A_{p,T+1} = A_{p,T} - e^{\circ}_{p,T+1} \begin{pmatrix} 0 \\ C_{p-1,T} \end{pmatrix}. \tag{34}$$

Here, the definition 

$$e^{\circ}_{p,T+1} = A^t_{p,T}\, y_{p,T+1} \tag{35}$$

is used for the tentative prediction error before updating the prediction 
coefficients. As for the minimal total mse, it is updated according to 

$$\begin{aligned} R^e_{p,T+1} &= A^t_{p,T} \begin{pmatrix} R^e_{p,T+1} \\ 0_p \end{pmatrix} = A^t_{p,T}\, R_{p,T+1}\, A_{p,T+1} \\ &= A^t_{p,T}\left(R_{p,T} + y_{p,T+1}\, y^t_{p,T+1}\right) A_{p,T+1} = R^e_{p,T} + e^{\circ}_{p,T+1}\, e_{p,T+1}. \end{aligned} \tag{36}$$

It should be mentioned here that only in the stationary case $e_{p,T+1} = e^{\circ}_{p,T+1}$ 
and, thus, $R^e_{p,T+1} = R^e_{p,T} + e^2_{p,T+1}$. As for updating $B_{p,T}$, two 
different possibilities are derived in the Appendix: 

$$B_{p,T+1} = B_{p,T} - r^{\circ}_{p,T+1} \begin{pmatrix} C_{p-1,T+1} \\ 0 \end{pmatrix} \tag{37}$$

or 

$$B_{p,T+1} = \left(B_{p,T} - r^{\circ}_{p,T+1}\, C_{p,T+1}\right) \times \frac{1}{1 - r^{\circ}_{p,T+1}\, \mu_{p,T+1}}. \tag{38}$$

From eq. (37), a relation like eq. (36) can be obtained: 

$$R^r_{p,T+1} = R^r_{p,T} + r^{\circ}_{p,T+1}\, r_{p,T+1}. \tag{39}$$

Again, $r^{\circ}_{p,T+1}$ is the tentative backward prediction error. The time 
update of $C_{p,T}$ is also obtained in the Appendix. 
From these results, a simple update for $k_{p,T}$ is found to be 

$$k_{p,T+1} = k_{p,T} + e^{\circ}_{p,T+1}\, r_{p,T}, \tag{40}$$

or alternatively, from eqs. (79) and (83), 

$$k_{p,T+1} = k_{p,T} + e^{\circ}_{p,T+1}\, r^{\circ}_{p,T}\left(1 - \gamma_{p-1,T}\right) = k_{p,T} + e_{p,T+1}\, r^{\circ}_{p,T}. \tag{41}$$

All these relations are derived in the Appendix. They form the basis 
for the lattice network, which updates these quantities both in order 
and time. 

IV. EFFICIENT CALCULATION OF THE OPTIMAL ESTIMATOR 
4.1 Tapped delay line estimator 

The optimal estimator for any order $p$ and for each time $T$ is given 
in eq. (9). Using eqs. (5) and (6), we get 

$$\begin{aligned} R_{p,T+1}\, w_{p,T+1} = g_{p,T+1} &= g_{p,T} + d_{T+1}\, y_{p,T+1} \\ &= R_{p,T}\, w_{p,T} + d_{T+1}\, y_{p,T+1} \\ &= R_{p,T+1}\, w_{p,T} + \left(d_{T+1} - w^t_{p,T}\, y_{p,T+1}\right) y_{p,T+1}. \end{aligned} \tag{42}$$

Therefore, 

$$\begin{aligned} w_{p,T+1} &= w_{p,T} + \left(d_{T+1} - w^t_{p,T}\, y_{p,T+1}\right) R^{-1}_{p,T+1}\, y_{p,T+1} \\ &= w_{p,T} + \left(d_{T+1} - \hat{d}^{\circ}_{p,T+1}\right) C_{p,T+1}. \end{aligned} \tag{43}$$

Note that updating $w_{p,T}$ involves the tentative estimate $\hat{d}^{\circ}_{p,T+1} = 
w^t_{p,T}\, y_{p,T+1}$, using the new data and present estimator weights. This 
makes it possible to implement this scheme in decision-directed 
equalizers, where the decision on which $d_{T+1}$ was transmitted is based on 
$\hat{d}^{\circ}_{p,T+1}$. Also note that the correction to $w_{p,T}$ is in the direction of 
$C_{p,T+1} = R^{-1}_{p,T+1}\, y_{p,T+1}$ rather than $y_{p,T+1}$, as in the gradient algorithm. These 
vectors are parallel only if $R_{p,T+1}$ is a unit matrix times a scalar; thus, 
all its eigenvalues are equal. When this is not the case, and $y_{p,T+1}$ 
contains eigenvectors corresponding to different eigenvalues, $R^{-1}_{p,T+1}$ 
equalizes the gains for these vectors. Also note the similarity in the 
updating equations (34), (37), and (43), which is to be expected, since 
prediction is a special case of estimation. 
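
The recursion of eq. (43) can be sketched as follows. This is not the fast Kalman procedure: as a stand-in for obtaining $C_{p,T+1}$, the sketch keeps the whole inverse matrix and refreshes it with the Sherman-Morrison identity applied to eq. (33), which costs on the order of $p^2$ operations per sample. The names `rls_step`, `data_vector`, and `delta` are illustrative assumptions, and starting from a small diagonal matrix `delta * I` is an assumed initialization that slightly regularizes the solution.

```python
def data_vector(y, p, t):
    """y_{p,t} = (y_t, y_{t-1}, ..., y_{t-p}), with y_i = 0 for i < 0."""
    return [y[t - i] if t - i >= 0 else 0.0 for i in range(p + 1)]


def rls_step(w, P, v, d_new):
    """One time update of eq. (43).

    P holds the inverse of R_{p,T}; v is the new data vector y_{p,T+1}.
    Returns the updated weights and the inverse of R_{p,T+1}.
    """
    n = len(w)
    Pv = [sum(P[i][j] * v[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(v[i] * Pv[i] for i in range(n))
    # C_{p,T+1} = R^{-1}_{p,T+1} y_{p,T+1}, via Sherman-Morrison on eq. (33)
    C = [x / denom for x in Pv]
    # Tentative estimate: present weights applied to the new data
    d_tent = sum(w[i] * v[i] for i in range(n))
    w_new = [w[i] + (d_new - d_tent) * C[i] for i in range(n)]
    P_new = [[P[i][j] - C[i] * Pv[j] for j in range(n)] for i in range(n)]
    return w_new, P_new


# Hypothetical data: the desired sequence comes from a known 3-tap filter.
y = [1.0, 0.5, -0.3, 0.8, -0.6, 0.2, 0.9, -0.4, 0.1, 0.7]
true_w = [0.5, -0.25, 0.125]
p = 2
delta = 1e-6  # assumed small diagonal loading so that the inverse exists at the start

w = [0.0] * (p + 1)
P = [[(1.0 / delta if i == j else 0.0) for j in range(p + 1)]
     for i in range(p + 1)]
for t in range(len(y)):
    v = data_vector(y, p, t)
    d_t = sum(a * b for a, b in zip(true_w, v))
    w, P = rls_step(w, P, v, d_t)
```

With the diagonal loading, the weights after the last sample equal the solution of eq. (8) with `delta * I` added to $R_{p,T}$, so they match the generating filter only up to a bias of order `delta`.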

The fast Kalman algorithm is an efficient recursive procedure to 
obtain $C_{p,T+1}$. This is given in Ref. 4 as follows: 

1. Assume that all vectors are available up to and including 
time $T$. 

2. Use eq. (35) to obtain $e^{\circ}_{p,T+1} = A^t_{p,T}\, y_{p,T+1}$. 

3. Use eq. (34) to calculate $A_{p,T+1} = A_{p,T} - e^{\circ}_{p,T+1} \begin{pmatrix} 0 \\ C_{p-1,T} \end{pmatrix}$. 

4. Use eq. (11) to calculate $e_{p,T+1} = A^t_{p,T+1}\, y_{p,T+1}$. 

5. Use eq. (36) to calculate $R^e_{p,T+1} = R^e_{p,T} + e^{\circ}_{p,T+1}\, e_{p,T+1}$. 

6. Calculate $e_{p,T+1}\, R^{-e}_{p,T+1}$. 

7. Use eq. (89) to calculate 

$$C_{p,T+1} = \begin{pmatrix} 0 \\ C_{p-1,T} \end{pmatrix} + e_{p,T+1}\, R^{-e}_{p,T+1}\, A_{p,T+1}.$$

8. From eq. (31), $\mu_{p,T+1}$ is found. 

9. Find $r^{\circ}_{p,T+1} = B^t_{p,T}\, y_{p,T+1}$. 

10. Use eq. (38) to calculate 

$$B_{p,T+1} = \left(B_{p,T} - r^{\circ}_{p,T+1}\, C_{p,T+1}\right) \times \frac{1}{1 - r^{\circ}_{p,T+1}\, \mu_{p,T+1}}.$$

11. Use eq. (73) to calculate 

$$\begin{pmatrix} C_{p-1,T+1} \\ 0 \end{pmatrix} = C_{p,T+1} - \mu_{p,T+1}\, B_{p,T+1}.$$

12. Calculate the tentative estimate $\hat{d}^{\circ}_{p,T+1} = w^t_{p,T}\, y_{p,T+1}$. 

13. Use eq. (43) to update the estimator weights 

$$w_{p,T+1} = w_{p,T} + \left(d_{T+1} - \hat{d}^{\circ}_{p,T+1}\right) C_{p,T+1}.$$

The initial conditions, when there is not enough input data so that 
$R_{p,T}$ in eq. (6) does not have an inverse, are discussed in Ref. 4. There 
are $10p + 5$ multiplications, $9p + 4$ additions, and 2 divisions for one 
complete updating cycle. Note that there are no matrix operations; 
only additions and products of scalars and vectors are involved. By 
comparison, the simple fixed-step gradient algorithm requires $2p + 1$ 
multiplications and $2p + 1$ additions per cycle. 

4.2 Lattice structure 

Here an equivalent algorithm that also gives the optimal estimator 
with about the same number of computations is derived. It is assumed 
that the input data are transformed by the lower triangular 
transformation matrix $L_{p,T}$, defined by 

$$L^t_{p,T} = \left[ \begin{pmatrix} B_{0,T} \\ 0_p \end{pmatrix}, \begin{pmatrix} B_{1,T} \\ 0_{p-1} \end{pmatrix}, \ldots, B_{p,T} \right]. \tag{44}$$

From eq. (20), the transformed data are 

$$L_{p,T}\, y_{p,T} = (r_{0,T}, r_{1,T}, \ldots, r_{p,T})^t = \mathbf{r}_{p,T}. \tag{45}$$

Note that as the dimension $p$ increases, new terms are included in $\mathbf{r}$, 
but the previous ones are unchanged. This is an important property of 
the lattice algorithm that enables us to change the estimation order 
without the need to recalculate all previous values, as is required, for example, in 
the fast Kalman algorithm. As before, we let the tentative transformed 
vector be 

$$\mathbf{r}^{\circ}_{p,T+1} = L_{p,T}\, y_{p,T+1}. \tag{46}$$

Define a new vector of weights 

$$H^t_{p,T} = (h_{0,T}, h_{1,T}, \ldots, h_{p,T}), \tag{47}$$

that now operates on $\mathbf{r}_{p,T}$ to give the estimate 

$$\hat{d}_{p,T} = H^t_{p,T}\, \mathbf{r}_{p,T} = H^t_{p,T}\, L_{p,T}\, y_{p,T}. \tag{48}$$

It is seen that for this estimate to be equivalent to the optimal 
estimator, $w_{p,T}$ is the transform of $H_{p,T}$, i.e., $w_{p,T} = L^t_{p,T}\, H_{p,T}$, and from eq. (8) 

$$L_{p,T}\, g_{p,T} = L_{p,T}\, R_{p,T}\, w_{p,T} = L_{p,T}\, R_{p,T}\, L^t_{p,T}\, H_{p,T}. \tag{49}$$

Using eq. (21), it can be shown that 

$$R_{p,T}\, L^t_{p,T} = \begin{pmatrix} R^r_{0,T} & & & 0 \\ & R^r_{1,T} & & \\ & & \ddots & \\ X & & X & R^r_{p,T} \end{pmatrix}, \tag{50}$$

which is a lower triangular matrix. The product $L_{p,T}\, R_{p,T}\, L^t_{p,T}$ is, thus, 
a symmetric product of two lower triangular matrices; therefore, it 
should be symmetric, lower triangular, and hence diagonal, i.e., 

$$L_{p,T}\, R_{p,T}\, L^t_{p,T} = D_{p,T}. \tag{51}$$

The diagonal terms are easily found using eq. (50): 

$$D_{p,T}(i, i) = \left[B^t_{i,T}\ (0_{p-i})^t\right] \begin{pmatrix} 0_i \\ R^r_{i,T} \\ X \end{pmatrix} = R^r_{i,T} \tag{52}$$

and, again, they are independent of $p$. 

At this point, a closer inspection of $L_{p,T}$ is of interest. It is a lower 
triangular matrix with 1 on the main diagonal. Therefore, $L^{-1}_{p,T}$ has the 
same structure, and eq. (45) can be rewritten as 

$$\mathbf{r}_{p,T} = y_{p,T} - \left(L^{-1}_{p,T} - I_p\right) \mathbf{r}_{p,T}. \tag{53}$$

This can be looked upon as a Gram-Schmidt procedure to calculate 
new orthogonal components of $\mathbf{r}_{p,T}$ from the components of $y_{p,T}$, minus 
their projections on the previous $\mathbf{r}_{p,T}$ components. Thus, eq. (51) 
represents the fact that the autocorrelation matrix of the transformed 
data is indeed diagonal. 
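
The diagonalization stated in eq. (51) can be checked numerically. In the sketch below, which is an illustration rather than the lattice recursion itself, each backward predictor $B_{i,T}$ is obtained by directly solving eq. (21) on the leading block of $R_{p,T}$, the rows of $L_{p,T}$ are assembled as in eq. (44), and the product $L R L^t$ is formed explicitly. All helper names and the data sequence are assumptions made for the example.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def corr_matrix(y, p, T):
    """R_{p,T} of eq. (6), with y_i = 0 for i < 0."""
    R = [[0.0] * (p + 1) for _ in range(p + 1)]
    for t in range(T + 1):
        v = [y[t - i] if t - i >= 0 else 0.0 for i in range(p + 1)]
        for i in range(p + 1):
            for j in range(p + 1):
                R[i][j] += v[i] * v[j]
    return R


def backward_predictor(R):
    """B = (b(p), ..., b(1), 1) from the first p equations of eq. (21)."""
    p = len(R) - 1
    if p == 0:
        return [1.0]
    b = solve([row[:p] for row in R[:p]], [-R[i][p] for i in range(p)])
    return b + [1.0]


def lattice_transform(R):
    """Rows of L are B_{i,T}, i = 0..p, padded with zeros on the right (eq. 44)."""
    n = len(R)
    L = []
    for i in range(n):
        sub = [row[:i + 1] for row in R[:i + 1]]  # R_{i,T}: leading block, eq. (18)
        L.append(backward_predictor(sub) + [0.0] * (n - 1 - i))
    return L


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


y = [1.0, 0.5, -0.3, 0.8, -0.6, 0.2, 0.9, -0.4, 0.1, 0.7]  # hypothetical data
R = corr_matrix(y, 3, len(y) - 1)
L = lattice_transform(R)
D = matmul(matmul(L, R), transpose(L))  # eq. (51): expected to be diagonal
```

The off-diagonal entries of `D` should vanish to rounding error, and each diagonal entry is the backward error energy $R^r_{i,T}$ of eq. (52), which does not depend on the largest order used.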

Using eqs. (49) and (51), $H_{p,T}$ can be found from the transformed 
$g_{p,T}$ by 

$$H_{p,T} = D^{-1}_{p,T}\, L_{p,T}\, g_{p,T}. \tag{54}$$

It should be noted that only scalar divisions rather than matrix 
inversion are needed here, and increasing the order of the lattice 
estimator does not change previously calculated values of $H$. This is 
why double indices are used in eq. (47) as compared to triple indices 
in eq. (2). Equation (54) can be broken into $p$ scalar equations 

$$h_{p,T} = R^{-r}_{p,T}\, B^t_{p,T}\, g_{p,T}. \tag{55}$$

To see how the right-hand side develops in time, define 

$$\rho_{p,T} = B^t_{p,T}\, g_{p,T}. \tag{56}$$

Then from eqs. (37), (5), (29), (9), and (3), in that order, 

$$\begin{aligned} \rho_{p,T+1} &= B^t_{p,T+1}\, g_{p,T+1} \\ &= \left[B^t_{p,T} - r^{\circ}_{p,T+1}\left(C^t_{p-1,T+1}\ \ 0\right)\right]\left(g_{p,T} + d_{T+1}\, y_{p,T+1}\right) \\ &= \rho_{p,T} + \left(d_{T+1} - \hat{d}_{p-1,T+1}\right) r^{\circ}_{p,T+1} = \rho_{p,T} + V_{p-1,T+1}\, r^{\circ}_{p,T+1}, \end{aligned} \tag{57}$$

where $V_{p,T}$ is the estimation error after the $p$th order estimator. For 
$p = 0$, 

$$\rho_{0,T+1} = g_{0,T+1} = \rho_{0,T} + d_{T+1}\, y_{T+1}, \tag{58}$$

i.e., $\hat{d}_{-1,T+1} \equiv 0$. Obviously, 

$$V_{p,T} = d_T - \hat{d}_{p,T} = d_T - H^t_{p,T}\, \mathbf{r}_{p,T} = V_{p-1,T} - h_{p,T}\, r_{p,T}. \tag{59}$$

The recursive solution to eq. (55) that corresponds to eq. (43) is: 

$$\begin{aligned} h_{p,T+1} &= R^{-r}_{p,T+1}\left(\rho_{p,T} + V_{p-1,T+1}\, r^{\circ}_{p,T+1}\right) \\ &= R^{-r}_{p,T+1}\left[\left(R^r_{p,T+1} - r^{\circ}_{p,T+1}\, r_{p,T+1}\right) h_{p,T} + V_{p-1,T+1}\, r^{\circ}_{p,T+1}\right] \\ &= h_{p,T} + R^{-r}_{p,T+1}\left(V_{p-1,T+1} - h_{p,T}\, r_{p,T+1}\right) r^{\circ}_{p,T+1}. \end{aligned} \tag{60}$$

This is equivalent to the first tap of the conventional tapped delay line 
equalizer for $p = 0$ only. The tentative estimate as in eq. (43) is now 

$$\hat{d}^{\circ}_{p,T+1} = w^t_{p,T}\, y_{p,T+1} = H^t_{p,T}\, L_{p,T}\, y_{p,T+1} = H^t_{p,T}\, \mathbf{r}^{\circ}_{p,T+1}. \tag{61}$$

The minimal total squared error is, from eq. (51), 

$$E_{p,T} = \sum_{i=0}^{T} d_i^2 - g^t_{p,T}\, R^{-1}_{p,T}\, g_{p,T} = \sum_{i=0}^{T} d_i^2 - \left(L_{p,T}\, g_{p,T}\right)^t D^{-1}_{p,T} \left(L_{p,T}\, g_{p,T}\right). \tag{62}$$

From the structure of $D$ and $L$ it follows that 

$$E_{p+1,T} = E_{p,T} - \left(B^t_{p+1,T}\, g_{p+1,T}\right)^2 R^{-r}_{p+1,T} = E_{p,T} - \rho^2_{p+1,T}\, R^{-r}_{p+1,T}. \tag{63}$$

Using eq. (57), the residual error can be found for all instants of time. 
It is then simple to decide whether $p$ should be increased, decreased, 
or unchanged to meet the desired performance. As mentioned before, 
when adding or deleting sections no recalculation of the coefficients is 
needed. 

The procedure for recursively obtaining the estimator $H_{p,T}$ and the 
estimate $\hat{d}_{p,T+1}$ is as follows: 

1. Assume that all quantities are known up to and including 
time $T$. 

2. Start with $e^{\circ}_{p,T+1} = e_{p,T+1} = r^{\circ}_{p,T+1} = r_{p,T+1} = y_{T+1}$ for $p = 0$. 

3. Use eqs. (22) and (25) to compute 

$$e^{\circ}_{1,T+1} = e^{\circ}_{0,T+1} - \left(k_{0,T}\, R^{-r}_{0,T-1}\right) r^{\circ}_{0,T}$$
$$r^{\circ}_{1,T+1} = r^{\circ}_{0,T} - \left(k_{0,T}\, R^{-e}_{0,T}\right) e^{\circ}_{0,T+1}.$$

4. Use eq. (40) to compute $k_{0,T+1} = k_{0,T} + e^{\circ}_{0,T+1}\, r_{0,T}$. 

5. Use eqs. (36) and (39) to compute 

$$R^e_{0,T+1} = R^e_{0,T} + e^{\circ}_{0,T+1}\, e_{0,T+1}$$
$$R^r_{0,T+1} = R^r_{0,T} + r^{\circ}_{0,T+1}\, r_{0,T+1}.$$

6. Compute the gain terms $k_{0,T+1}\, R^{-r}_{0,T}$ and $k_{0,T+1}\, R^{-e}_{0,T+1}$ to obtain from 
eqs. (27) and (28) 

$$e_{1,T+1} = e_{0,T+1} - \left(k_{0,T+1}\, R^{-r}_{0,T}\right) r_{0,T}$$
$$r_{1,T+1} = r_{0,T} - \left(k_{0,T+1}\, R^{-e}_{0,T+1}\right) e_{0,T+1}.$$

These gain terms can be saved for the next recursion. 

7. Repeat steps 3 to 6 for $p = 1, 2, \ldots$. 

8. Use eq. (61) to compute the tentative estimate 

$$\hat{d}^{\circ}_{p,T+1} = H^t_{p,T}\, \mathbf{r}^{\circ}_{p,T+1}.$$

9. To update $H_{p,T}$, start with $V_{-1,T+1} = d_{T+1}$ from eqs. (58) and (59) 
and use eq. (60) to compute 

$$h_{0,T+1} = h_{0,T} + R^{-r}_{0,T+1}\left(V_{-1,T+1} - h_{0,T}\, r_{0,T+1}\right) r^{\circ}_{0,T+1}.$$

10. Use eq. (59) to compute 

$$V_{0,T+1} = V_{-1,T+1} - h_{0,T+1}\, r_{0,T+1}.$$

11. Repeat steps 9 and 10 for $p = 1, 2, \ldots$. 

12. Use eq. (48) to compute $\hat{d}_{p,T+1} = H^t_{p,T+1}\, \mathbf{r}_{p,T+1}$. 

Steps 3, 6, 8, and 10 can be drawn in a block diagram as in Fig. 1. The 
variable gain terms are $k_{p,T}\, R^{-r}_{p,T-1}$, $k_{p,T}\, R^{-e}_{p,T}$, and $h_{p,T}$, and they are 
updated in steps 4, 6, and 9. When the system reaches a steady state, 
it can be illustrated in a simpler form as shown in Fig. 2. 

4.2.1 Starting the algorithm 

The problem of finding the optimal predictor/estimator of order $p$ 
is not well defined if there are fewer than $p$ input points. Therefore, 
when starting the algorithm, $p$ should be 0 in the first recursion, 1 in 
the second, and $p$ should grow linearly in time until it reaches the 
desired number of sections of the lattice. This is in contrast with the 
fast Kalman algorithm, where a small diagonal matrix is assumed for 
$R_{p,0}$. 

Fig. 1. The basic form of the lattice estimator. 

4.2.2 Number of operations 

The number of operations required for the three algorithms, the 
simple gradient, the fast Kalman, and the lattice, are given below, 
where $p$ is the number of adaptive parameters: 

Algorithm      Multiplications   Additions   Divisions 

Gradient       2p                2p          - 
Fast Kalman    10p               9p          2 
Lattice        12p               11p         3p 


V. DISCUSSION 

It was shown in eq. (9) that the optimal linear estimator that yields 
the least total mse is obtained by matrix inversion. A recursive 
algorithm to update the optimal estimator also involves an inversion of 
the correlation matrix of the data as in eq. (43). If the input data are 
uncorrelated (i.e., low signal embedded in flat noise, or data signal with 
Nyquist spectral shape), then multiplying by $R^{-1}_{p,T}$ is equivalent to 
scalar division, which is the simple gradient algorithm. However, if the 
data are highly correlated and $R_{p,T}$ has its eigenvalues spread out 
($\lambda_{\max}/\lambda_{\min} \gg 1$), then the optimal recursive algorithm for the fastest 
convergence of the estimator is more complex: The estimator can still 
have the form of a tapped delay line, but now the shift properties of 
$R_{p,T}$ are used to update $w_{p,T}$ in $O(p)$ multiplications. A different 
approach demonstrated in this paper is to transform the input data 
using the lattice network to get uncorrelated inputs to the estimator. 
The estimator weights are now different but related to the original set 
through the same transformation, eq. (48). This is similar to Ref. 6, 
except that the transformation matrix is time variant. The performance 
of the nonstationary lattice and the fast Kalman are the same (both 
give the minimal error), and in Ref. 2 it is demonstrated that 
convergence time can be reduced by a factor of 15, compared to the simple 
gradient algorithm. 

Fig. 2. The basic form of the steady state lattice estimator. 

The differences between the lattice and the fast Kalman algorithms 
in practical, finite precision digital implementation should be fully 
discussed elsewhere. However, an example can be given here. 
Examining step 5 for the Kalman algorithm, eq. (36) may occasionally render 
$R^e_{p,T+1}$ which is nonpositive because of the accumulation of arithmetic 
errors. The algorithm is useless from that time on, and has to be 
restarted. On the other hand, for the lattice algorithm, step 5, if either 
$R^e_{p,T+1}$ or $R^r_{p,T+1}$ becomes nonpositive, force it to be some small but 
positive number, and make all $k_{i,T+1}$ equal zero for $i \ge p$. The updating 
algorithm then falls back to the gradient algorithm from tap $p$ and on, 
or the filter length can be shortened to length $p$, as desired. 
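
The safeguard described for step 5 of the lattice algorithm can be sketched as a small helper. The function name `guard_stage`, the list-based storage of the per-stage quantities, and the value of `floor` are assumptions made for illustration; the text only asks for "some small but positive number".

```python
def guard_stage(Re, Rr, k, p, floor=1e-12):
    """Apply the nonpositive-energy safeguard at lattice stage p.

    Re[i], Rr[i] hold the forward and backward error energies of stage i,
    and k[i] the corresponding lattice gain numerator. If either energy at
    stage p is nonpositive, it is forced to a small positive floor and all
    k[i] with i >= p are set to zero. New lists are returned; the inputs
    are left unchanged.
    """
    Re, Rr, k = list(Re), list(Rr), list(k)
    if Re[p] <= 0.0 or Rr[p] <= 0.0:
        Re[p] = max(Re[p], floor)
        Rr[p] = max(Rr[p], floor)
        for i in range(p, len(k)):
            k[i] = 0.0
    return Re, Rr, k


# Hypothetical state in which rounding has driven one energy slightly negative.
Re_in = [1.0, -1e-9, 0.5]
Rr_in = [1.0, 0.2, 0.3]
k_in = [0.3, 0.4, 0.5]
Re_out, Rr_out, k_out = guard_stage(Re_in, Rr_in, k_in, p=1)
```

Zeroing the gains from stage `p` onward means those stages pass their inputs through unchanged, which corresponds to the fallback or shortening of the filter mentioned above; stages below `p` are not affected.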

Future work should try to make use of the above recurrence update 
relations for exponentially weighted errors, the "fading memory" 
case under time-varying situations. Also, simpler, suboptimal 
algorithms can be derived and should be compared to the exact algorithm 
in terms of performance and complexity. 

VI. CONCLUSIONS 

(i) The lattice algorithm gives identical results to the fast Kalman 
algorithm for adapting filter coefficients when both have the same 
number of coefficients. 

(ii) The number of multiplications for the two algorithms is about 
the same, but the lattice requires more divisions for normalization by 
the residual error energy at each stage. 

(iii) Changing the number of taps is easier under the lattice 
algorithm. 

(iv) In limited-precision implementation under severe amplitude 
distorting channels, the last property of the lattice algorithm may be 
valuable in providing better performance. 

APPENDIX 

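A.1 Derivation of the order update of $A_{p,T}$ 

From eqs. (14), (18), and (21), it is found that 

$$R_{p+1,T} \begin{pmatrix} A_{p,T} \\ 0 \end{pmatrix} = \begin{pmatrix} R_{p,T} & X \\ X & X \end{pmatrix} \begin{pmatrix} A_{p,T} \\ 0 \end{pmatrix} = \begin{pmatrix} R^e_{p,T} \\ 0_p \\ k_{p,T} \end{pmatrix} \tag{64}$$

and similarly 

$$R_{p+1,T} \begin{pmatrix} 0 \\ B_{p,T-1} \end{pmatrix} = \begin{pmatrix} X & X \\ X & R_{p,T-1} \end{pmatrix} \begin{pmatrix} 0 \\ B_{p,T-1} \end{pmatrix} = \begin{pmatrix} k'_{p,T} \\ 0_p \\ R^r_{p,T-1} \end{pmatrix} \tag{65}$$

for some $k_{p,T}$, $k'_{p,T}$. From the fact that $A_{p,T}(0) = B_{p,T-1}(p) = 1$ it can be 
shown that 

$$\begin{aligned} k_{p,T} &= \left(0\ \ B^t_{p,T-1}\right) \begin{pmatrix} R^e_{p,T} \\ 0_p \\ k_{p,T} \end{pmatrix} = \left(0\ \ B^t_{p,T-1}\right) R_{p+1,T} \begin{pmatrix} A_{p,T} \\ 0 \end{pmatrix} \\ &= \left(k'_{p,T}\ \ (0_p)^t\ \ R^r_{p,T-1}\right) \begin{pmatrix} A_{p,T} \\ 0 \end{pmatrix} = k'_{p,T}. \end{aligned} \tag{66}$$

Combining eqs. (64) and (65) in a proper way, it is found that 

$$R_{p+1,T} \left[ \begin{pmatrix} A_{p,T} \\ 0 \end{pmatrix} - k_{p,T}\, R^{-r}_{p,T-1} \begin{pmatrix} 0 \\ B_{p,T-1} \end{pmatrix} \right] = \begin{pmatrix} R^e_{p+1,T} \\ 0_{p+1} \end{pmatrix} \tag{67}$$

with 

$$R^{-r}_{p,T-1} = \left(R^r_{p,T-1}\right)^{-1}. \tag{68}$$

Therefore, 

$$A_{p+1,T} = \begin{pmatrix} A_{p,T} \\ 0 \end{pmatrix} - k_{p,T}\, R^{-r}_{p,T-1} \begin{pmatrix} 0 \\ B_{p,T-1} \end{pmatrix}. \tag{69}$$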
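
A.2 The order update of $C_{p,T}$ 

For the order update of $C_{p,T}$, note that 

$$R_{p+1,T} \begin{pmatrix} C_{p,T} \\ 0 \end{pmatrix} = \begin{pmatrix} y_{p,T} \\ X \end{pmatrix} \tag{70}$$

and 

$$R_{p+1,T}\, B_{p+1,T} = \begin{pmatrix} 0_{p+1} \\ R^r_{p+1,T} \end{pmatrix}. \tag{71}$$

From this it can be seen that $C_{p+1,T}$ is a linear combination of $\begin{pmatrix} C_{p,T} \\ 0 \end{pmatrix}$ 
and $B_{p+1,T}$. 
From the relation 

$$\left((0_{p+1})^t\ \ R^r_{p+1,T}\right) C_{p+1,T} = B^t_{p+1,T}\, R_{p+1,T}\, C_{p+1,T} = B^t_{p+1,T}\, y_{p+1,T} = r_{p+1,T}, \tag{72}$$

it then must be that 

$$C_{p+1,T} = \begin{pmatrix} C_{p,T} \\ 0 \end{pmatrix} + r_{p+1,T}\, R^{-r}_{p+1,T}\, B_{p+1,T} = \begin{pmatrix} C_{p,T} \\ 0 \end{pmatrix} + \mu_{p+1,T}\, B_{p+1,T} = \begin{pmatrix} X \\ \mu_{p+1,T} \end{pmatrix}, \tag{73}$$

with the definition 

$$\mu_{p+1,T} = r_{p+1,T}\, R^{-r}_{p+1,T}, \tag{74}$$

i.e., $\mu_{p+1,T}$ is the last term of $C_{p+1,T}$. 
Therefore, 

$$\gamma_{p+1,T} = C^t_{p+1,T}\, y_{p+1,T} = \gamma_{p,T} + r^2_{p+1,T}\, R^{-r}_{p+1,T}, \tag{75}$$

and, thus, 

$$\gamma_{p,T} = \sum_{i=0}^{p} r^2_{i,T}\, R^{-r}_{i,T}. \tag{76}$$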

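A.3 The time update relations 

From eq. (33) the time update of $A_{p,T}$ is obtained as follows: 

$$\begin{aligned} R_{p,T+1}\, A_{p,T} &= \left(R_{p,T} + y_{p,T+1}\, y^t_{p,T+1}\right) A_{p,T} = \begin{pmatrix} R^e_{p,T} \\ 0_p \end{pmatrix} + y_{p,T+1}\, e^{\circ}_{p,T+1} \\ &= \begin{pmatrix} R^e_{p,T+1} \\ 0_p \end{pmatrix} + R_{p,T+1} \begin{pmatrix} 0 \\ C_{p-1,T} \end{pmatrix} e^{\circ}_{p,T+1} \end{aligned} \tag{77}$$

for some $R^e_{p,T+1}$, with the definition (35) for $e^{\circ}_{p,T+1}$. From eq. (77), it can 
be seen that 

$$A_{p,T+1} = A_{p,T} - e^{\circ}_{p,T+1} \begin{pmatrix} 0 \\ C_{p-1,T} \end{pmatrix}, \tag{78}$$

and multiplying both sides by $y_{p,T+1}$ gives the relation 

$$e_{p,T+1} = e^{\circ}_{p,T+1}\left(1 - \gamma_{p-1,T}\right) \quad \text{for } p = 1, 2, \ldots \tag{79}$$

As for $p = 0$, we get from the definition that 

$$e_{0,T+1} = e^{\circ}_{0,T+1} \quad \text{or} \quad \gamma_{-1,T} \equiv 0.$$

The time update of $B_{p,T}$ is obtained similarly: 

$$\begin{aligned} R_{p,T+1}\, B_{p,T} &= \left(R_{p,T} + y_{p,T+1}\, y^t_{p,T+1}\right) B_{p,T} = \begin{pmatrix} 0_p \\ R^r_{p,T} \end{pmatrix} + y_{p,T+1}\, r^{\circ}_{p,T+1} \\ &= \begin{pmatrix} 0_p \\ R^r_{p,T+1} \end{pmatrix} + R_{p,T+1} \begin{pmatrix} C_{p-1,T+1} \\ 0 \end{pmatrix} r^{\circ}_{p,T+1}. \end{aligned} \tag{80}$$

Therefore, 

$$B_{p,T+1} = B_{p,T} - r^{\circ}_{p,T+1} \begin{pmatrix} C_{p-1,T+1} \\ 0 \end{pmatrix}. \tag{81}$$

As in eqs. (36), (78), and (79), 

$$R^r_{p,T+1} = R^r_{p,T} + r^{\circ}_{p,T+1}\, r_{p,T+1} \tag{82}$$

and 

$$r_{p,T+1} = r^{\circ}_{p,T+1}\left(1 - \gamma_{p-1,T+1}\right) \quad \text{for } p = 1, 2, \ldots \tag{83}$$

and 

$$r_{0,T+1} = r^{\circ}_{0,T+1} = y_{T+1}.$$

The time update of $B_{p,T}$ can also be obtained as follows: 

$$\begin{aligned} R_{p,T+1}\, B_{p,T} &= \left(R_{p,T} + y_{p,T+1}\, y^t_{p,T+1}\right) B_{p,T} \\ &= \begin{pmatrix} 0_p \\ R^r_{p,T} \end{pmatrix} + y_{p,T+1}\, r^{\circ}_{p,T+1} = \begin{pmatrix} 0_p \\ R^r_{p,T} \end{pmatrix} + R_{p,T+1}\, C_{p,T+1}\, r^{\circ}_{p,T+1}. \end{aligned} \tag{84}$$

Thus, 

$$B_{p,T+1} = \left(B_{p,T} - r^{\circ}_{p,T+1}\, C_{p,T+1}\right) \times \frac{1}{1 - r^{\circ}_{p,T+1}\, \mu_{p,T+1}}, \tag{85}$$

where the denominator is chosen to make $b_{p,T+1}(0) = 1$ using the 
definition (32). The time update of $C_{p,T}$ is obtained as in eqs. (70) to 
(73). Note that 

$$R_{p,T+1} \begin{pmatrix} 0 \\ C_{p-1,T} \end{pmatrix} = \begin{pmatrix} X \\ y_{p-1,T} \end{pmatrix} \tag{86}$$

and 

$$R_{p,T+1}\, A_{p,T+1} = \begin{pmatrix} R^e_{p,T+1} \\ 0_p \end{pmatrix}. \tag{87}$$

Therefore, $C_{p,T+1}$ is a linear combination of $\begin{pmatrix} 0 \\ C_{p-1,T} \end{pmatrix}$ 
and $A_{p,T+1}$. From 

$$\left[R^e_{p,T+1}\ \ (0_p)^t\right] C_{p,T+1} = A^t_{p,T+1}\, R_{p,T+1}\, C_{p,T+1} = A^t_{p,T+1}\, y_{p,T+1} = e_{p,T+1}, \tag{88}$$

it is found that 

$$C_{p,T+1} = \begin{pmatrix} 0 \\ C_{p-1,T} \end{pmatrix} + e_{p,T+1}\, R^{-e}_{p,T+1}\, A_{p,T+1}. \tag{89}$$

Multiplying by $y_{p,T+1}$ gives 

$$\gamma_{p,T+1} = \gamma_{p-1,T} + e^2_{p,T+1}\, R^{-e}_{p,T+1} \tag{90}$$

and 

$$\gamma_{p,T+1} = \sum_{i=0}^{p} e^2_{i,T+i+1-p}\, R^{-e}_{i,T+i+1-p}. \tag{91}$$

For the time update of $k_{p,T}$, use the definitions (64) and (65): 

$$\begin{aligned} k_{p,T+1} &= \left[k_{p,T+1}\ \ (0_p)^t\ \ R^r_{p,T}\right] \begin{pmatrix} A_{p,T} \\ 0 \end{pmatrix} \\ &= \left(0\ \ B^t_{p,T}\right)\left(R_{p+1,T} + y_{p+1,T+1}\, y^t_{p+1,T+1}\right) \begin{pmatrix} A_{p,T} \\ 0 \end{pmatrix} \\ &= \left(0\ \ B^t_{p,T}\right) \begin{pmatrix} R^e_{p,T} \\ 0_p \\ k_{p,T} \end{pmatrix} + r_{p,T}\, e^{\circ}_{p,T+1} = k_{p,T} + e^{\circ}_{p,T+1}\, r_{p,T}. \end{aligned} \tag{92}$$

Using eqs. (79) and (83), an alternative form is 

$$k_{p,T+1} = k_{p,T} + e^{\circ}_{p,T+1}\, r^{\circ}_{p,T}\left(1 - \gamma_{p-1,T}\right) = k_{p,T} + e_{p,T+1}\, r^{\circ}_{p,T}. \tag{93}$$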